A Universal Phoneme-Set Based Language Independent Short Utterance Speaker Recognition
Abstract
Short utterance speaker recognition (SUSR) has attracted growing attention in recent years. Despite advances in the technology and the use of phonetic cues for speaker recognition, the role of individual phonemes in carrying speaker information remains an open question. This paper proposes using phoneme classes as a basis for SUSR. The present work is restricted to vowel classes and defines combined vowel classes across two languages, English and Chinese. These sets are used to build a universal background phoneme-class model (UBPM), which is then used for training and testing with conventional GMM-UBM systems. Experimental results indicate that speech segments as short as phonemes carry useful speaker information.
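The GMM-UBM pipeline named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-covariance background model whose parameters are already available (here set by hand, standing in for a UBPM trained on frames of one vowel class), mean-only MAP adaptation with a relevance factor of 16, and scoring by average log-likelihood ratio. All function names and the toy two-dimensional "features" are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def component_log_probs(x, weights, means, variances):
    """log(w_k * N(x | mean_k, var_k)) for every mixture component k."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]

def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one frame under the mixture."""
    return logsumexp(component_log_probs(x, weights, means, variances))

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of the background model to one speaker.

    The relevance factor (assumed value 16) controls how far the means
    move away from the background model for a given amount of data.
    """
    k, dim = len(weights), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x in frames:
        logs = component_log_probs(x, weights, means, variances)
        norm = logsumexp(logs)
        for j, lp in enumerate(logs):
            post = math.exp(lp - norm)  # responsibility of component j
            counts[j] += post
            for d in range(dim):
                sums[j][d] += post * x[d]
    # Equivalent to alpha * E[x] + (1 - alpha) * mean,
    # with alpha = count / (count + relevance).
    return [
        [(sums[j][d] + relevance * means[j][d]) / (counts[j] + relevance)
         for d in range(dim)]
        for j in range(k)
    ]

def llr_score(frames, speaker_means, weights, means, variances):
    """Average log-likelihood ratio: adapted speaker model vs. background."""
    total = 0.0
    for x in frames:
        total += gmm_log_likelihood(x, weights, speaker_means, variances)
        total -= gmm_log_likelihood(x, weights, means, variances)
    return total / len(frames)

# Hand-set two-component background model (stand-in for a trained UBPM).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]

# Toy enrollment frames from a speaker shifted by about +1 in each dimension.
enroll = [[0.8, 0.9], [1.0, 1.1], [0.9, 1.0], [4.9, 5.0], [5.1, 4.8]]
speaker_means = map_adapt_means(enroll, ubm_weights, ubm_means, ubm_vars)

target_score = llr_score([[1.0, 1.0], [5.0, 5.0]], speaker_means,
                         ubm_weights, ubm_means, ubm_vars)
impostor_score = llr_score([[-1.0, -1.0], [3.0, 3.0]], speaker_means,
                           ubm_weights, ubm_means, ubm_vars)
```

In this toy setup the frames that resemble the enrollment data score above zero and the others below it. In a phoneme-class variant, one such background model and one adapted speaker model would presumably be kept per vowel class, with test frames routed to the model of their class; that routing step is not shown here.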
Similar resources
A Multi-Model Method for Short-Utterance Speaker Recognition
The length of the test speech greatly influences the performance of a GMM-UBM based text-independent speaker recognition system. For example, when the valid speech is as short as 1 to 5 seconds, performance decreases significantly, because the GMM-UBM method is statistical and depends on sufficient data. Considering that the use of text inform...
Speech unit category based short utterance speaker recognition
Information about speech units such as vowels, consonants and syllables can serve as knowledge in text-independent Short Utterance Speaker Recognition (SUSR), in a similar way as in text-dependent speaker recognition. In such tasks, the data available for each speech unit, especially at recognition time, is often insufficient. Hence, it is not practical to use the full set of speech units because some ...
Assamese Vowel Phoneme Recognition Using Zero Crossing Rate and Short-time Energy
Speaker recognition is the identification of the person who is speaking from the characteristics of their voice. Assamese is an Indo-Aryan language, mainly spoken in north-eastern India. In this paper, a text-dependent speaker modelling technique is used. The system comprises a training phase, a testing phase and a recognition phase. The database consists of utterance of 10 speake...
Spectral Characteristics of Vocal Tract for Speaker Recognition
The basic idea of the presented approach is to evaluate spectral characteristics corresponding to the anatomy of the speaker's vocal tract, independently of the phoneme actually pronounced. The procedure for determining the speaker-specific average spectrum is based on the LPC approach. Experimental results have shown an evolution in a long-time spectrum with respect to the duration of text in...
Text-Independent Speaker Verification via State Alignment
To model the speech utterance at a finer granularity, this paper presents a novel state-alignment based supervector modeling method for text-independent speaker verification, which takes advantage of the state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. In this way, the proposed modeling method can convert a text-independent speaker verification...